 clustering technique


Advanced Clustering Techniques for Speech Signal Enhancement: A Review and Metanalysis of Fuzzy C-Means, K-Means, and Kernel Fuzzy C-Means Methods

arXiv.org Artificial Intelligence

Speech signal processing is a cornerstone of modern communication technologies, tasked with improving the clarity and comprehensibility of audio data in noisy environments. The primary challenge in this field is the effective separation and recognition of speech from background noise, crucial for applications ranging from voice-activated assistants to automated transcription services. The quality of speech recognition directly impacts user experience and accessibility in technology-driven communication. This review paper explores advanced clustering techniques, particularly focusing on the Kernel Fuzzy C-Means (KFCM) method, to address these challenges. Our findings indicate that KFCM, compared to traditional methods like K-Means (KM) and Fuzzy C-Means (FCM), provides superior performance in handling non-linear and non-stationary noise conditions in speech signals. The most notable outcome of this review is the adaptability of KFCM to various noisy environments, making it a robust choice for speech enhancement applications. Additionally, the paper identifies gaps in current methodologies, such as the need for more dynamic clustering algorithms that can adapt in real time to changing noise conditions without compromising speech recognition quality. Key contributions include a detailed comparative analysis of current clustering algorithms and suggestions for further integrating hybrid models that combine KFCM with neural networks to enhance speech recognition accuracy. Through this review, we advocate for a shift towards more sophisticated, adaptive clustering techniques that can significantly improve speech enhancement and pave the way for more resilient speech processing systems.
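To illustrate what separates the fuzzy methods named above from K-Means, the sketch below computes standard Fuzzy C-Means membership degrees for a single scalar sample, given fixed cluster centers. It is a minimal illustration under stated assumptions, not the reviewed paper's implementation: the function name `fcm_memberships`, the fuzzifier default `m=2.0`, and the one-dimensional absolute-difference distance are all chosen here for the example. A kernel variant such as KFCM would replace this distance with a kernel-induced one, which is not shown.

```python
def fcm_memberships(x, centers, m=2.0, eps=1e-12):
    """Return the FCM membership of scalar x in each cluster.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the distance from x to center i.
    `m` (fuzzifier) and `eps` are illustrative defaults, not values from the paper.
    """
    # eps avoids division by zero when x coincides with a center
    dists = [abs(x - c) + eps for c in centers]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


if __name__ == "__main__":
    # A sample closer to the first center gets a larger (but not exclusive) membership.
    print(fcm_memberships(1.0, [0.0, 3.0]))
```

Unlike a K-Means assignment, which would place the sample wholly in the nearest cluster, the memberships are fractional and sum to one.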


Targeted demand response for flexible energy communities using clustering techniques

arXiv.org Artificial Intelligence

The present study proposes clustering techniques for designing demand response (DR) programs for commercial and residential prosumers. The goal is to alter the consumption behavior of the prosumers within a distributed energy community in Italy. This aggregation aims to: a) minimize the reverse power flow at the primary substation, occurring when generation from solar panels in the local grid exceeds consumption, and b) shift the system-wide peak demand, which typically occurs during the late afternoon. Regarding the clustering stage, we consider daily prosumer load profiles and divide them across the extracted clusters. Three popular machine learning algorithms are employed, namely k-means, k-medoids and agglomerative clustering. We evaluate the methods using multiple metrics including a novel metric proposed within this study, namely peak performance score (PPS). The k-means algorithm with dynamic time warping distance considering 14 clusters exhibits the highest performance with a PPS of 0.689. Subsequently, we analyze each extracted cluster with respect to load shape, entropy, and load types. These characteristics are used to distinguish the clusters that have the potential to serve the optimization objectives by matching them to proper DR schemes including time of use, critical peak pricing, and real-time pricing. Our results confirm the effectiveness of the proposed clustering algorithm in generating meaningful flexibility clusters, while the derived DR pricing policy encourages consumption during off-peak hours. The developed methodology is robust to the low availability and quality of training datasets and can be used by aggregator companies for segmenting energy communities and developing personalized DR policies.
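The abstract above relies on dynamic time warping (DTW) as the distance between daily load profiles. The following is a minimal sketch of the classic DTW recurrence, assuming scalar samples and an absolute-difference local cost; the function name `dtw_distance` and the toy profiles are hypothetical and are not taken from the study, which may use a windowed or otherwise constrained variant.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Classic O(len(a) * len(b)) dynamic program with |a_i - b_j| as the
    local cost and no warping-window constraint (an assumption of this sketch).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


if __name__ == "__main__":
    # Two toy "load profiles" with the same peak shifted by one time step.
    early_peak = [0, 1, 2, 1, 0, 0]
    late_peak = [0, 0, 1, 2, 1, 0]
    print(dtw_distance(early_peak, late_peak))
```

Because DTW can stretch the time axis, two profiles with the same shape but a shifted peak are treated as close, whereas a point-by-point distance would penalize the shift. That property is one plausible reason to pair it with k-means for load shapes.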


Comparative Analysis of Clustering Techniques for Personalized Food Kit Distribution

arXiv.org Artificial Intelligence

The Government of Kerala increased the frequency of supply of free food kits owing to the pandemic; however, the items in these kits were fixed and did not reflect the personal preferences of consumers. This paper conducts a comparative analysis of various clustering techniques on a scaled-down version of a real-world dataset obtained through a conjoint analysis-based survey. Clustering carried out by centroid-based methods such as k-means is analyzed and the results are plotted along with SVD, and finally, a conclusion is reached as to which of the two is better. Once the clusters have been formed, commodities are also decided upon for each cluster. Clustering is further enhanced by reassignment based on a specific cluster loss threshold. Thus, the most efficacious clustering technique for designing a food kit tailored to the needs of individuals is finally obtained.


5 Clustering Methods in Machine Learning

#artificialintelligence

First, an overview of some common terminology. A cluster is a group of objects that belong to the same class; in other words, objects with similar properties are grouped in one cluster, and dissimilar objects are placed in another. Clustering is the process of dividing objects into a number of groups such that the objects within each group are more similar to each other than to objects in other groups. Put simply, it means segmenting data into groups with similar properties or behaviour and assigning them to clusters. As an important analysis method in machine learning, clustering is used for identifying patterns and structure in labelled and unlabelled datasets. It is an exploratory data analysis technique that identifies subgroups in data such that data points in the same subgroup (cluster) are very similar to each other, while data points in different clusters have different characteristics.
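As a concrete illustration of grouping similar objects, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The function names `sq_dist`, `assign`, and `kmeans`, the random seeding, and the fixed iteration count are assumptions made for this example and do not come from the article.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Run a fixed number of Lloyd iterations; return (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign(points, centroids), centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans(data, k=2)
    print(labels, centroids)
```

On the toy data, the three points near the origin end up in one cluster and the three points near (10, 10) in the other; which cluster receives label 0 depends on the random initialization.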


20 Data Science Interview Questions for a Beginner

#artificialintelligence

Success is a process, not an event. Data Science is growing rapidly in all sectors. With so many technologies available within the Data Science domain, it can be tricky to crack a Data Science interview. In this article, we have tried to cover the most common Data Science interview questions asked by recruiters. Answer: The question can also be phrased as why linear regression is not a very effective algorithm.


Comparison of Clustering Techniques for Residential Energy Behavior Using Smart Meter Data

AAAI Conferences

Current practice in whole time series clustering of residential meter data focuses on aggregated or subsampled load data at the customer level, which ignores day-to-day differences within customers. This information is critical to determine each customer’s suitability to various demand side management strategies that support intelligent power grids and smart energy management. Clustering daily load shapes provides fine-grained information on customer attributes and sources of variation for subsequent models and customer segmentation. In this paper, we apply 11 clustering methods to daily residential meter data. We evaluate their parameter settings and suitability based on 6 generic performance metrics and post-checking of resulting clusters. Finally, we recommend suitable techniques and parameters based on the goal of discovering diverse daily load patterns among residential customers. To the authors’ knowledge, this paper is the first robust comparative review of clustering techniques applied to daily residential load shape time series in the power systems’ literature.


A Study of FOSS'2013 Survey Data Using Clustering Techniques

arXiv.org Machine Learning

FOSS is an acronym for Free and Open Source Software. The FOSS 2013 survey primarily targets FOSS contributors, and the relevant anonymized dataset is publicly available under a CC BY-SA license. In this study, the dataset is analyzed from a critical perspective using statistical and clustering techniques (especially multiple correspondence analysis), with a strong focus on women contributors, in order to discover hidden trends and facts. Important inferences are drawn about development practices and other facets of the free software and OSS worlds.